What is Reflexion?
Reflexion is a system designed to improve language-based agents using feedback from their actions. According to Shinn et al. (2023), it represents a new approach to "verbal" reinforcement learning, where an agent's memory and choices of language model parameters work together to enhance learning.

How Reflexion Works
Reflexion transforms feedback—whether it's in free-form language or numerical scores—from the environment into a type of feedback called "self-reflection." This self-reflection serves as context for the language model agent in its next task. By using this method, the agent can quickly learn from past mistakes, leading to better performance in various tasks.

Key Components of the Reflexion Framework

Reflexion consists of three main models:

1. The Actor: This component creates text and takes actions based on what it observes in its environment. It generates a sequence of actions and outcomes (a trajectory) while using strategies like Chain-of-Thought (CoT) and ReAct models. Additionally, a memory feature provides extra context for the agent.

2. The Evaluator: This part assesses the Actor's outputs. It evaluates the generated trajectory (or short-term memory) and assigns a reward score based on the quality of the output. Different reward functions are used depending on the specific task, utilizing both language models and rule-based methods for decision-making.

3. Self-Reflection: This component generates verbal feedback to help the Actor improve. It uses the language model to provide useful insights for future tasks. By considering the reward signals, the current trajectory, and past experiences stored in memory, this model helps the agent enhance its decision-making skills over time.

Reflexion Process Steps
The Reflexion process involves the following steps:  
1. Define a task.  
2. Generate a trajectory.  
3. Evaluate the trajectory.  
4. Reflect on the performance.  
5. Create the next trajectory.

The framework allows agents to iteratively refine their behavior across various tasks, including decision-making, programming, and reasoning. Reflexion builds on the ReAct framework by adding features for self-evaluation, self-reflection, and memory.

Performance Results

Experimental Findings 
Experiments show that agents using Reflexion significantly outperform others on various tasks, such as decision-making in AlfWorld, answering reasoning questions in HotPotQA, and programming tasks on HumanEval.

For sequential decision-making tasks in AlfWorld, Reflexion, combined with self-evaluation methods, successfully completed 130 out of 134 tasks, demonstrating a notable improvement over the ReAct framework alone.

In reasoning tasks, Reflexion, when supplemented with episodic memory, outperformed earlier models, proving its effectiveness in these areas.

Programming Achievements
Reflexion has also shown superior performance in writing Python and Rust code across benchmarks like MBPP, HumanEval, and Leetcode Hard.

When to Use Reflexion

Reflexion is especially useful in the following situations:

- Learning from Trial and Error: It helps agents learn by reflecting on past mistakes, making it ideal for tasks requiring iterative improvement.
- Impracticality of Traditional RL: Unlike traditional reinforcement learning methods that require large amounts of data and complex adjustments, Reflexion offers a more efficient approach without needing to fine-tune the underlying models.
- Need for Nuanced Feedback: Reflexion uses detailed verbal feedback, allowing agents to better understand their errors and improve accordingly.
- Importance of Interpretability: Reflexion provides clear and explicit memory that enhances the agent's learning process, making it easier to analyze its decisions.

Effective Tasks for Reflexion

Reflexion works well in:

- Sequential Decision-Making: Enhancing performance in navigation tasks like those found in AlfWorld.
- Reasoning: Improving responses in question-answering datasets such as HotPotQA.
- Programming: Achieving better results in coding challenges like HumanEval and MBPP.

Limitations of Reflexion

Despite its strengths, Reflexion has some limitations:

- Dependence on Self-Evaluation: Its effectiveness relies on the agent's ability to accurately assess its performance and generate useful reflections, which can be challenging for complex tasks.
- Long-Term Memory Limits: Reflexion uses a limited memory capacity, so for more complicated tasks, advanced memory structures may be needed.
- Code Generation Challenges: There can be difficulties in accurately mapping inputs and outputs in code generation, especially with non-deterministic functions and hardware-related influences.
